DISCO-SCA and Properly Applied GSVD as Swinging Methods to Find Common and Distinctive Processes

Authors

  • Katrijn Van Deun
  • Iven Van Mechelen
  • Lieven Thorrez
  • Martijn Schouteden
  • Bart De Moor
  • Mariët J. van der Werf
  • Lieven De Lathauwer
  • Age K. Smilde
  • Henk A. L. Kiers
Abstract

BACKGROUND In systems biology it is common to obtain information from multiple sources for the same set of biological entities. Examples include expression data for the same set of orthologous genes screened in different organisms and data on the same set of culture samples obtained with different high-throughput techniques. A major challenge is to find the important biological processes underlying the data and to disentangle processes common to all data sources from processes distinctive for a specific source. Recently, two promising simultaneous data integration methods have been proposed to attain this goal, namely generalized singular value decomposition (GSVD) and simultaneous component analysis with rotation to common and distinctive components (DISCO-SCA).
RESULTS Both theoretical analyses and applications to biologically relevant data show that: (1) straightforward applications of GSVD yield unsatisfactory results, (2) DISCO-SCA performs well, (3) given proper pre-processing and algorithmic adaptations, GSVD reaches a performance level similar to that of DISCO-SCA, and (4) DISCO-SCA is directly generalizable to more than two data sources. The biological relevance of DISCO-SCA is illustrated with two applications. First, in a setting of comparative genomics, it is shown that DISCO-SCA recovers a common theme of cell cycle progression and a yeast-specific response to pheromones. The biological annotation was obtained by applying Gene Set Enrichment Analysis in an appropriate way. Second, in an application of DISCO-SCA to metabolomics data for Escherichia coli obtained with two different chemical analysis platforms, it is illustrated that the metabolites involved in some of the biological processes underlying the data are detected by only one of the two platforms; therefore, platforms for microbial metabolomics should be tailored to the biological question.
CONCLUSIONS Both DISCO-SCA and properly applied GSVD are promising integrative methods for finding common and distinctive processes in multisource data. Open source code for both methods is provided.
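
The idea of common versus distinctive components can be illustrated with a minimal sketch of the simultaneous component analysis step alone. This is not the authors' open source code, and it omits the rotation that gives DISCO-SCA its name: the toy data below are constructed (as an assumption) so that the unrotated components already separate cleanly. The function name `sca_explained`, the block scaling to unit sum of squares, and all data dimensions are hypothetical choices made for illustration.

```python
import numpy as np

def sca_explained(blocks, n_components):
    """Simultaneous component analysis on blocks that share their rows.

    Returns the common scores, the per-block loadings, and the fraction of
    each block's sum of squares explained by each component.
    """
    prepped = []
    for X in blocks:
        Xc = X - X.mean(axis=0)                  # center every column
        prepped.append(Xc / np.linalg.norm(Xc))  # scale block to unit sum of squares
    concat = np.hstack(prepped)                  # blocks side by side, rows shared

    U, s, Vt = np.linalg.svd(concat, full_matrices=False)
    scores = U[:, :n_components]                 # orthonormal scores shared by all blocks
    all_loadings = Vt[:n_components].T * s[:n_components]

    # Cut the loading matrix back into one piece per block.
    splits = np.cumsum([X.shape[1] for X in blocks])[:-1]
    loadings = np.split(all_loadings, splits, axis=0)

    # Scores are orthonormal and each block has unit sum of squares, so the
    # squared loadings of a component sum to its explained fraction.
    explained = np.array([(P ** 2).sum(axis=0) for P in loadings])
    return scores, loadings, explained

# Toy data (an assumption, not data from the paper): one process present in
# both blocks and one process present in block 1 only.
rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 2))
raw -= raw.mean(axis=0)
t, _ = np.linalg.qr(raw)                         # two orthonormal, mean-zero score vectors
w1, _ = np.linalg.qr(rng.normal(size=(8, 2)))    # orthonormal weights for block 1
w2 = rng.normal(size=6)
w2 /= np.linalg.norm(w2)                         # unit-length weights for block 2

X1 = (3 * np.outer(t[:, 0], w1[:, 0]) + 2 * np.outer(t[:, 1], w1[:, 1])
      + 0.01 * rng.normal(size=(50, 8)))
X2 = 3 * np.outer(t[:, 0], w2) + 0.01 * rng.normal(size=(50, 6))

scores, loadings, explained = sca_explained([X1, X2], n_components=2)
print(np.round(explained, 2))  # rows: blocks, columns: components
```

In this sketch a component that explains a substantial share of every block behaves as common, while a component that explains variation in one block and almost none in the other behaves as distinctive. On real data the unrotated components generally mix the two kinds of information, which is the situation the rotation to common and distinctive components is meant to address.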

Similar resources

Simplified GSVD computations for the solution of linear discrete ill-posed problems

The generalized singular value decomposition (GSVD) often is used to solve Tikhonov regularization problems with a regularization matrix without exploitable structure. This paper describes how the standard methods for the computation of the GSVD of a matrix pair can be simplified in the context of Tikhonov regularization. Also, other regularization methods, including truncated GSVD, are conside...

Improved speech enhancement by applying time-shift property of DFT on hankel matrices for signal subspace decomposition

In previous studies, the signal subspace technique for speech enhancement was extended and a perceptually constrained generalized singular value decomposition (PCGSVD)-based algorithm [1] was developed which properly integrated the auditory masking effect and the GSVD algorithm. Both objective measures and subjective tests verified that this approach can offer better performance than the GSVD-b...

Multi-objective planning of charging stations considering benefits of distribution company and charging stations owners

In recent years, electric vehicles have attracted significant attention due to environmental issues. Charging stations installation requires a systematic consideration of relevant issues such as determination of the location and size of charging stations. On the other hand, it is necessary to encourage private investors to invest in charging stations installations and to provide proper conditio...

A Higher-Order Generalized Singular Value Decomposition for Comparison of Global mRNA Expression from Multiple Organisms

The number of high-dimensional datasets recording multiple aspects of a single phenomenon is increasing in many areas of science, accompanied by a need for mathematical frameworks that can compare multiple large-scale matrices with different row dimensions. The only such framework to date, the generalized singular value decomposition (GSVD), is limited to two matrices. We mathematically define ...

On A New Approach for Self-optimizing Control Structure Design

In this paper, a new method for the identification of self-optimizing control structure designs (CSDs) based on generalized singular value decomposition (GSVD) is proposed. The method is primarily dedicated to find optimal CSDs where all controlled variables (CVs) are represented by a common set of linear combinations of process variables (PVs). It is shown that the implementation of the GSVD i...

Journal title:

Volume 7  Issue 

Pages  -

Publication date: 2012